Here, we’re just setting a few options.

knitr::opts_chunk$set(
  warning = TRUE, # show warnings during codebook generation
  message = TRUE, # show messages during codebook generation
  error = TRUE, # do not interrupt codebook generation in case of errors,
                # usually better for debugging
  echo = TRUE  # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())

Now, we’re preparing our data for the codebook.

library(codebook)
# example data shipped with the package (replaced by the CSV import below)
codebook_data <- codebook::bfi
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
codebook_data <- rio::import("national_parks/data/park_biodiversity_data/parks.csv")

# omit the following lines if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
    only_labelled = TRUE,                # only labelled values are autodetected as
                                         # missing
    negative_values_are_missing = FALSE, # FALSE: do not treat negative values as
                                         # missing (Longitude is negative here)
    ninety_nine_problems = TRUE          # 99/999 are missing values, if they
                                         # are more than 5 MAD from the median
    )

# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
codebook_data <- detect_scales(codebook_data)
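
Because items are not reversed automatically, negatively keyed items have to be recoded by hand before the aggregate is built. The following is a minimal base R sketch of that step, assuming items on a 1-5 response scale; the data frame `survey`, the scale name `extra`, and all values are made up for illustration and are not part of the parks data.

```r
# Made-up responses on a 1-5 scale; `extra` is a hypothetical scale name.
survey <- data.frame(
  extra_1 = c(1, 5, 3),
  extra_2 = c(5, 1, 3)   # negatively keyed item, still in its raw direction
)

# Reverse the negatively keyed item by hand: (min + max) - x, here 1 + 5 = 6.
survey$extra_2R <- 6 - survey$extra_2

# Drop the raw item so only the reversed version (suffix R) remains.
survey$extra_2 <- NULL

# Store the aggregate under the bare scale name, following the
# scale = scale_1 + scale_2R naming pattern described above.
survey$extra <- rowMeans(survey[, c("extra_1", "extra_2R")])

print(survey)
```

With columns named this way, the aggregate and its items follow the pattern that the comment above describes, so the `detect_scales()` call should be able to pick them up.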

Create codebook

codebook(codebook_data)
## No missing values.

Metadata

Description

Dataset name: codebook_data

The dataset has N=56 rows and 6 columns. 56 rows have no missing values on any column.
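
The counts in this description can be cross-checked with base R. The sketch below uses a small made-up data frame (`toy`, with hypothetical columns `code` and `acres`) in place of `codebook_data`, so the numbers it prints are not those of the parks data.

```r
# Toy data frame standing in for codebook_data (all values are made up).
toy <- data.frame(
  code  = c("AAA", "BBB", "CCC"),
  acres = c(100, NA, 300)
)

n_rows     <- nrow(toy)                        # number of rows
n_cols     <- ncol(toy)                        # number of columns
n_complete <- sum(stats::complete.cases(toy))  # rows without any NA

cat(sprintf(
  "The dataset has N=%d rows and %d columns. %d rows have no missing values on any column.\n",
  n_rows, n_cols, n_complete
))
```

Running the same three calls on `codebook_data` should reproduce the figures reported above.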

Metadata for search engines
  • Date published: 2021-04-29
  • Keywords: Park Code, Park Name, State, Acres, Latitude, Longitude

Variables

Park Code

Distribution

Distribution of values for Park Code

0 missing values.

Summary statistics

name       data_type  n_missing  complete_rate  n_unique  empty  min  max  whitespace  label
Park Code  character  0          1              56        0      4    4    0           NA

Park Name

Distribution

Distribution of values for Park Name

0 missing values.

Summary statistics

name       data_type  n_missing  complete_rate  n_unique  empty  min  max  whitespace  label
Park Name  character  0          1              56        0      18   46   0           NA

State

Distribution

Distribution of values for State

0 missing values.

Summary statistics

name   data_type  n_missing  complete_rate  n_unique  empty  min  max  whitespace  label
State  character  0          1              27        0      2    10   0           NA

Acres

Distribution

Distribution of values for Acres

0 missing values.

Summary statistics

name   data_type  n_missing  complete_rate  min   median  max      mean      sd       hist   label
Acres  numeric    0          1              5550  238764  8323148  927929.1  1709258  ▇▁▁▁▁  NA

Latitude

Distribution

Distribution of values for Latitude

0 missing values.

Summary statistics

name      data_type  n_missing  complete_rate  min  median  max  mean      sd        hist   label
Latitude  numeric    0          1              19   39      68   41.23393  10.90883  ▂▇▅▁▂  NA

Longitude

Distribution

Distribution of values for Longitude

0 missing values.

Summary statistics

name       data_type  n_missing  complete_rate  min   median  max  mean       sd        hist   label
Longitude  numeric    0          1              -159  -111    -68  -113.2348  22.44029  ▂▁▇▂▂  NA

Missingness report

Codebook table

JSON-LD metadata

Search engines can find the following JSON-LD if you share this codebook publicly on the web.

{
  "name": "codebook_data",
  "datePublished": "2021-04-29",
  "description": "The dataset has N=56 rows and 6 columns.\n56 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name      |label | n_missing|\n|:---------|:-----|---------:|\n|Park Code |NA    |         0|\n|Park Name |NA    |         0|\n|State     |NA    |         0|\n|Acres     |NA    |         0|\n|Latitude  |NA    |         0|\n|Longitude |NA    |         0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
  "keywords": ["Park Code", "Park Name", "State", "Acres", "Latitude", "Longitude"],
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "Park Code",
      "@type": "propertyValue"
    },
    {
      "name": "Park Name",
      "@type": "propertyValue"
    },
    {
      "name": "State",
      "@type": "propertyValue"
    },
    {
      "name": "Acres",
      "@type": "propertyValue"
    },
    {
      "name": "Latitude",
      "@type": "propertyValue"
    },
    {
      "name": "Longitude",
      "@type": "propertyValue"
    }
  ]
}
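
To reuse this metadata programmatically, the JSON-LD can be parsed back into R. The sketch below assumes the jsonlite package is installed and works on a trimmed copy of the JSON above (two of the six variables, description omitted); the object names `json_ld` and `meta` are chosen for illustration.

```r
library(jsonlite)

# Trimmed copy of the JSON-LD above: two variables only, description omitted.
json_ld <- '{
  "name": "codebook_data",
  "datePublished": "2021-04-29",
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {"name": "Park Code", "@type": "propertyValue"},
    {"name": "Acres", "@type": "propertyValue"}
  ]
}'

# fromJSON() turns the array of objects into a data frame by default.
meta <- jsonlite::fromJSON(json_ld)

meta[["@type"]]               # type of the described object
meta$variableMeasured$name    # names of the variables listed in the metadata
```

Keys that start with `@` are not syntactic R names, which is why they are accessed with `[[ ]]` rather than `$`.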